Nucleotide sequence of φC31int INT

1 ATGGCACAAG GGGTTGTGAC CGGGGTGGAT ACGTAAGTTT CTGCTTCTAC CTTTGATATA
61 TATATAATAA TTATCATTAA TTAGTAGTAA TATAATATTT CAAATATTTT TTTCAAAATA
121 AAAGAATGTA GTATATAGCA ATTGCTTTTC TGTAGTTTAT AAGTGTGTAT ATTTTAATTT
181 ATAACTTTTC TAATATATGA CCAAAATTTG TTGATGTGCA GGTACGCGGG TGCTTACGAC 
241 CGTCAGTCGC GCGAGCGCGA GAATTCGAGC GCAGCAAGCC CAGCGACACA GCGTAGCGCC 
301 AACGAAGACA AGGCGGCCGA CCTTCAGCGC GAAGTCGAGC GCGACGGGGG CCGGTTCAGG 
361 TTCGTCGGGC ATTTCAGCGA AGCGCCGGGC ACGTCGGCGT TCGGGACGGC GGAGCGCCCG 
421 GAGTTCGAAC GCATCCTGAA CGAATGCCGC GCCGGGCGGC TCAACATGAT CATTGTCTAT 
481 GACGTGTCGC GCTTCTCGCG CCTGAAGGTC ATGGACGCGA TTCCGATTGT CTCGGAATTG 
541 CTCGCCCTGG GCGTGACGAT TGTTTCCACT CAGGAAGGCG TCTTCCGGCA GGGAAACGTC 
601 ATGGACCTGA TTCACCTGAT TATGCGGCTC GACGCGTCGC ACAAAGAATC TTCGCTGAAG
661 TCGGCGAAGA TTCTCGACAC GAAGAACCTT CAGCGCGAAT TGGGCGGGTA CGTCGGCGGG 
721 AAGGCGCCTT ACGGCTTCGA GCTTGTTTCG GAGACGAAGG AGATCACGCG CAACGGCCGA 
781 ATGGTCAATG TCGTCATCAA CAAGCTTGCG CACTCGACCA CTCCCCTTAC CGGACCCTTC 
841 GAGTTCGAGC CCGACGTAAT CCGGTGGTGG TGGCGTGAGA TCAAGACGCA CAAACACCTT 
901 CCCTTCAAGC CGGGCAGTCA AGCCGCCATT CACCCGGGCA GCATCACGGG GCTTTGTAAG 
961 CGCATGGACG CTGACGCCGT GCCGACCCGG GGCGAGACGA TTGGGAAGAA GACCGCTTCA 
1021 AGCGCCTGGG ACCCGGCAAC CGTTATGCGA ATCCTTCGGG ACCCGCGTAT TGCGGGCTTC 
1081 GCCGCTGAGG TGATCTACAA GAAGAAGCCG GACGGCACGC CGACCACGAA GATTGAGGGT
1141 TACCGCATTC AGCGCGACCC GATCACGCTC CGGCCGGTCG AGCTTGATTG CGGACCGATC 
1201 ATCGAGCCCG CTGAGTGGTA TGAGCTTCAG GCGTGGTTGG ACGGCAGGGG GCGCGGCAAG 
1261 GGGCTTTCCC GGGGGCAAGC CATTCTGTCC GCCATGGACA AGCTGTACTG CGAGTGTGGC 
1321 GCCGTCATGA CTTCGAAGCG CGGGGAAGAA TCGATCAAGG ACTCTTACCG CTGCCGTCGC
1381 CGGAAGGTGG TCGACCCGTC CGCACCTGGG CAGCACGAAG GCACGTGCAA CGTCAGCATG
1441 GCGGCACTCG ACAAGTTCGT TGCGGAACGC ATCTTCAACA AGATCAGGCA CGCCGAAGGC 
1501 GACGAAGAGA CGTTGGCGCT TCTGTGGGAA GCCGCCCGAC GCTTCGGCAA GCTCACTGAG 
1561 GCGCCTGAGA AGAGCGGCGA ACGGGCGAAC CTTGTTGCGG AGCGCGCCGA CGCCCTGAAC 
1621 GCCCTTGAAG AGCTGTACGA AGACCGCGCG GCAGGCGCGT ACGACGGACC CGTTGGCAGG 
1681 AAGCACTTCC GGAAGCAACA GGCAGCGCTG ACGCTCCGGC AGCAAGGGGC GGAAGAGCGG
1741 CTTGCCGAAC TTGAAGCCGC CGAAGCCCCG AAGCTTCCCC TTGACCAATG GTTCCCCGAA 
1801 GACGCCGACG CTGACCCGAC CGGCCCTAAG TCGTGGTGGG GGCGCGCGTC AGTAGACGAC
1861 AAGCGCGTGT TCGTCGGGCT CTTCGTAGAC AAGATCGTTG TCACGAAGTC GACTACGGGC
1921 AGGGGGCAGG GAACGCCCAT CGAGAAGCGC GCTTCGATCA CGTGGGCGAA GCCGCCGACC 
1981 GACGACGACG AAGACGACGC CCAGGACGGC ACGGAAGACG TAGCGGCGTA G
Nucleotide sequence of φC31int*

1 ATGGCACAAG GGGTTGTGAC CGGGGTGGAT ACGTAAGTTT CTGCTTCTAC CTTTGATATA 
61 TATATAATAA TTATCATTAA TTAGTAGTAA TATAATATTT CAAATATTTT TTTCAAAATA
121 AAAGAATGTA GTATATAGCA ATTGCTTTTC TGTAGTTTAT AAGTGTGTAT ATTTTAATTT
181 ATAACTTTTC TAATATATGA CCAAAATTTG TTGATGTGCA GGTACGCGGG TGCTTACGAC 
241 CGTCAGTCGC GCGAGCGCGA GAATAGCAGT GCAGCAAGCC CAGCGACACA GCGTAGCGCC 
301 AACGAAGACA AGGCGGCCGA CCTTCAGCGC GAAGTCGAGC GCGACGGGGG CCGGTTCAGG 
361 TTCGTCGGGC ATTTCAGCGA AGCGCCGGGC ACGTCGGCGT TCGGGACGGC GGAGCGCCCG 
421 GAGTTCGAAC GCATCCTGAA CGAATGCCGC GCCGGGCGGC TCAACATGAT CATTGTCTAT 
481 GACGTGTCGC GCTTCTCGCG CCTGAAGGTC ATGGACGCGA TTCCGATTGT CTCGGAATTG 
541 CTCGCCCTGG GCGTGACGAT TGTTTCCACT CAGGAAGGCG TCTTCCGGCA GGGAAACGTC 
601 ATGGACCTGA TTCACCTGAT TATGCGGCTC GACGCGTCGC ACAAAGAATC TTCGCTGAAG
661 TCGGCGAAGA TTCTCGACAC GAAGAACCTT CAGCGCGAAT TGGGCGGGTA CGTCGGCGGG 
721 AAGGCGCCTT ACGGCTTCGA GCTTGTTTCG GAGACGAAGG AGATCACGCG CAACGGCCGA
781 ATGGTCAATG TCGTCATCAA CAAGTTAGCG CACTCGACCA CTCCCCTTAC CGGACCCTTC 
841 GAGTTCGAGC CCGACGTAAT CCGGTGGTGG TGGCGTGAGA TCAAGACGCA CAAACACCTT 
901 CCCTTCAAGC CGGGCAGTCA AGCCGCCATT CACCCGGGCA GCATCACGGG GCTTTGTAAG 
961 CGCATGGACG CTGACGCCGT GCCGACCCGG GGCGAGACGA TTGGGAAGAA GACCGCTTCA 
1021 AGCGCCTGGG ACCCGGCAAC CGTTATGCGA ATCCTTCGGG ACCCGCGTAT TGCGGGCTTC 
1081 GCCGCTGAGG TGATCTACAA GAAGAAGCCG GACGGCACGC CGACCACGAA GATTGAGGGT 
1141 TACCGCATTC AGCGCGACCC GATCACGCTC CGGCCGGTCG AGCTTGATTG CGGACCGATC 
1201 ATCGAGCCCG CTGAGTGGTA TGAGCTTCAG GCGTGGTTGG ACGGCAGGGG GCGCGGCAAG 
1261 GGGCTTTCCC GGGGGCAAGC CATTCTGTCC GCCATGGACA AGCTGTACTG CGAGTGTGGC 
1321 GCCGTCATGA CTTCGAAGCG CGGGGAAGAA TCGATCAAGG ACTCTTACCG CTGCCGTCGC 
1381 CGGAAGGTGG TCGACCCGTC CGCACCTGGG CAGCACGAAG GCACGTGCAA CGTCAGCATG 
1441 GCGGCACTCG ACAAGTTCGT TGCGGAACGC ATCTTCAACA AGATCAGGCA CGCCGAAGGC 
1501 GACGAAGAGA CGTTGGCGCT TCTGTGGGAA GCCGCCCGAC GCTTCGGCAA GCTCACTGAG 
1561 GCGCCTGAGA AGAGCGGCGA ACGGGCGAAC CTTGTTGCGG AGCGCGCCGA CGCCCTGAAC 
1621 GCCCTTGAAG AGCTGTACGA AGACCGCGCG GCAGGAGCTT ACGACGGACC CGTTGGCAGG 
1681 AAGCACTTCC GGAAGCAACA GGCAGCGCTG ACGCTCCGGC AGCAAGGGGC GGAAGAGCGG 
1741 CTTGCCGAAC TTGAAGCCGC CGAAGCCCCG AAGTTGCCCC TTGACCAATG GTTCCCCGAA
1801 GACGCCGACG CTGACCCGAC CGGCCCTAAG TCGTGGTGGG GGCGCGCGTC AGTAGACGAC
1861 AAGCGCGTGT TCGTCGGGCT CTTCGTAGAC AAGATCGTTG TCACGAAGTC GACTACGGGC
1921 AGGGGGCAGG GAACGCCCAT CGAGAAGCGC GCTTCGATCA CGTGGGCGAA GCCGCCGACC 
1981 GACGACGACG AAGACGACGC CCAGGACGGC ACGGAAGACG TAGCGGCGTA G 
Sequence:

1 TGGTGATTTT GTGCCGAGCT GCCGGTCGGG GAGCTGTTGG CTGGCTGGTG GCAGGATATA 
61 TTGTGGTGTA AACAAATTGA CGCTTAGACA ACTTAATAAC ACATTGCGGA CGTCTTTAAT 
121 GTACTGAATT AACATCCGTT TGATACTTGT CTAAAATTGG CTGATTTCGA GTGCATCTAT 
181 GCATAAAAAC AATCTAATGA CAATTATTAC CAAGCAGGAT CACCGGTGCC AGGGCGTGCC 
241 CTTGGGCTCC CCGGGCGCGG CCCGGGCAAT TCCCATCTTG AAAGAAATAT AGTTTAAATA
301 TTTATTGATA AAATAAGTCA GGTATTATAG TCCAAGCAAA AACATAATTT ATTGATGCAA 
361 AGTTTAAATT CAGAAATATT TCAATAACTG ATTATATCAG CTGGTACATT GCCGTAGATG 
421 AAAGACTGAG TGCGATATTA TGTGTAATAC ATAAATTGAT GATATAGCTA GCTTAGCTCA
481 TCGGGGGATC CTTAATCGAC TCTAGCTAGA ACGAATTGTT AGGTGGCGGT ACTTGGGTCG 
541 ATATCAAAGT GCATCACTTC TTCCCGTATG CCCAACTTTG TATAGAGAGC CACTGCGGGA
601 TCGTCACCGT AATCTGCTTG CACGTAGATC ACATAAGCAC CAAGCGCGTT GGCCTCATGC 
661 TTGAGGAGAT TGATGAGCGC GGTGGCAATG CCCTGCCTCC GGTGCTCGCC GGAGACTGCG 
721 AGATCATAGA TATAGATCTC ACTACGCGGC TGCTCAAACC TGGGCAGAAC GTAAGCCGCG 
781 AGAGCGCCAA CAACCGCTTC TTGGTCGAAG GCAGCAAGCG CGATGAATGT CTTACTACGG
841 AGCAAGTTCC CGAGGTAATC GGAGTCCGGC TGATGTTGGG AGTAGGTGGC TACGTCTCCG 
901 AACTCACGAC CGAAAAGATC AAGAGCAGCC CGCATGGATT TGACTTGGTC AGGGCCGAGC 
961 CTACATGTGC GAATGATGCC CATCCTCGAG AAACGTTTGT AATCGATGGC TTCTGGCTGC 
1021 TCCAGATATA CGGTGGTTTG TGCCGGTTGT GTGCTGGCAA TCACCTTGCC GCCACGTACC
1081 GAATAACGTA CCGGAACCTG ACGGCGCAGC GCATCAAACC CATTTTCAGC CGGCAGGATA 
1141 ATCAGGTTGG CGCTGTTTCC GGCGGCAATG CCGTAATCCT GCAAATTCAA CGTCCTTGCG
1201 CTGTGGTGGG TGATTAAATT CAGGCCATCG TTAATCTGCC CGTAGCCCAT CAACTGGCAA 
1261 ACATGCAGCC CCATATGCAG CACTTGCAGC ATATTCGCCG TTCCCAGCGG ATACCACGGA 
1321 TCGAAGACAT CATCGTGACC AAAGCAGACG TTAATGCCGG ACTCCAGCAT CTCTTTAACG 
1381 CGCGTGATGC CGCGACGTTT TGGATACGTA TCGAAACGTC CTTGCAGATG AATATTGACC 
1441 AGCGGGTTGG CGACAAAGTT AATACCGGAC ATTTTCAGCA AGCGGAACAG GCGTGAGGTA 
1501 TACGCCCCGT TATAGGAGTG CATTGCCGTG GTGTGGCTGG CGGTGACTCG CGCGCCCATG 
1561 CCTTCATGGT GCGCCAGGGC AGCAACGGTT TCGACAAAGC GCGACTGCTC GTCATCGATC 
1621 TCATCACAGT GAACGTCGAT GAGACGGTCG TATTTTTGCG CCAGGGCGAA GGTTTTATGC 
1681 AGCGACTCCA CGCCGTATTC ACGGGTAAAT TCAAAATGCG GAATCGCCCC CACTACATCT
1741 GCCCCTAAGC GTAACGCCTC TTCCAGCAAC GCTTCACCGT TGGGATACGA CAAAATCCCT
1801 TCCTGAGGGA AGGCGACGAT TTGCAGATCA ATCCACGGCG CGACTTCCTG CTTCACTTCC
1861 AGCATTGCTT TCAGCGCAGT TAGCGTTGCA TCCGAAACAT CGACATGGGT ACGCACATGC
1921 TGAATGCCGT TGGCAATCTG CCATTTCAGC GTTTGCCATG CGCGTTGTTT CACATCGTCA
1981 TGGGTTAATA ACGCTTTGCG CTCGGCCCAG CGTTCAATGC CTTCAAACAG CGTGCCGGAC
2041 TGATTCCAGT TCGGTTGTCC GGCGGTTTGC GTGGTGTCCA GGTGAATATG TGGCTCCACA 
2101 AACGGCGGTA TAACTAAACC TTGTTCGGCA TCCAGGCTGT TTTCAGTTAT GGGCATCACG
2161 CCGGATTGCG CATCAATGGC GCTGATTTTT CCGTCCTGCA GATGAATCTG CCACAGCCCC 
2221 TCTTCGCCTG GTAACCGGGC GTTAATAATT GTTTGTAAAG CGTTATTCGA CACTGTTAGC 
2281 CTCCCCATGG AGATCTGGAT TGAGAGTGAA TATGAGACTC TAATTGGATA CCGAGGGGAA
2341 TTTATGGAAG TCAGTGGAGC ATTTTTGACA AGAAATATTT GCTAGCTGAT AGTGACCTTA
2401 GGCGACTTTT GAACGCGCAA TAATGGTTTC TGACGTATGT GCTTAGCTCA TTAAACTCCA 
2461 GAAACCCGCG GCTGAGTGGC TCCTTCAACG TTGCGGTTCT GTCAGTTCCA AACGTAAAAC 
2521 GGCTTGTCCC GCGTCATCGG CGGGGGTCAT AACGTGACTC CCTTAATTCT CCGCTCATGA 
2581 TCTTGATCCC CTGCGCCATC AGATCCTTGG CGGCAAGAAA GCCATCCAGT TTACTTTGCA 
2641 GGGCTTCCCA ACCTTACCAG AGGGCGCCCC AGCTGGCAAT TCCGGTTCGC TTGCTGTCCA 
2701 TAAAACCGCC CAGTCTAGCT ATCGCCATGT AAGCCCACTG CAAGCTACCT GCTTTCTCTT 
2761 TGCGCTTGCG TTTTCCCTTG TCCAGATAGC CCAGTAGCTG ACATTCATCC GGGGTCAGCA 
2821 CCGTTTCTGC GGACTGGCTT TCTACGTGTT CCGCTTCCTT TAGCAGCCCT TGCGCCCTGA
2881 GTGCTTGCGG CAGCGTGAAG CTTGGCGCGC CAAGCTTGCA TGCCCGCTCT TAGCCGTACA 
2941 ATATTACTCA CCGGTGCGAT GCCCCCCATC GTAGGTGAAG GTGGAAATTA ATGATCCATC 
3001 TTGAGACCAC AGGCCCACAA CAGCTACCAG TTTCCTCAAG GGTCCACCAA AAACGTAAGC 
3061 GCTTACGTAC ATGGTCGATA AGAAAAGGCA ATTTGTAGAT GTTAACATCC AACGTCGCTT 
3121 TCAGGGATCC TTTTTACCGA CAACTCATCC ACATTGATGG TAGGCAGAAA GTTAAAGGAT 
3181 TATCGCAAGT CAATACTTGC CCATTCATTG ATCTATTTAA AGGTGTGGCC TCAAGGAGAT
3241 CCCCGGGCCG GCAATTCATA TGTCTAGATT AGATAAAAGT AAAGTGATTA ACAGCGCATT
3301 AGAGCTGCTT AATGAGGTCG GAATCGAAGG TTTAACAACC CGTAAACTCG CCCAGAAGCT 
3361 AGGTGTAGAG CAGCCTACAT TGTATTGGCA TGTAAAAAAT AAGCGGGCTT TGCTCGACGC 
3421 CTTAGCCATT GAGATGTTAG ATAGGCACCA TACTCACTTT TGCCCTTTAG AAGGGGAAAG 
3481 CTGGCAAGAT TTTTTACGTA ATAACGCTAA AAGTTTTAGA TGTGCTTTAC TAAGTCATCG 
3541 CGATGGAGCA AAAGTACATT TAGGTACACG GCCTACAGAA AAACAGTATG AAACTCTCGA 
3601 AAATCAATTA GCCTTTTTAT GCCAACAAGG TTTTTCACTA GAGAATGCAT TATATGCACT 
3661 CAGCGCTGTG GGGCATTTTA CTTTAGGTTG CGTATTGGAA GATCAAGAGC ATCAAGTCGC 
3721 TAAAGAAGAA AGGGAAACAC CTACTACTGA TAGTATGCCG CCATTATTAC GACAAGCTAT 
3781 CGAATTATTT GATCACCAAG GTGCAGAGCC AGCCTTCTTA TTCGGCCTTG AATTGATCAT 
3841 ATGCGGATTA GAAAAACAAC TTAAATGTGA AAGTGGGTCC GCGTACAGCC GCGCGCGTAC 
3901 GAAAAACAAT TACGGGTCTA CCATCGAGGG CCTGCTCGAT CTCCCGGACG ACGACGCCCC 
3961 CGAAGAGGCG GGGCTGGCGG CTCCGCGCCT GTCCTTTCTC CCCGCGGGAC ACACGCGCAG 
4021 ACTGTCGACG GCCCCCCCGA CCGATGTCAG CCTGGGGGAC GAGCTCCACT TAGACGGCGA 
4081 GGACGTGGCG ATGGCGCATG CCGACGCGCT AGACGATTTC GATCTGGACA TGTTGGGGGA 
4141 CGGGGATTCC CCGGGTCCGG GATTTACCCC CCACGACTCC GCCCCCTACG GCGCTCTGGA 
4201 TATGGCCGAC TTCGAGTTTG AGCAGATGTT TACCGATGCC CTTGGAATTG ACGAGTACGG 
4261 TGGGTAGGGG GCGCGAGGAT CTCGAGCAGC TCGAATTTCC CCGATCGTTC AAACATTTGG 
4321 CAATAAAGTT TCTTAAGATT GAATCCTGTT GCCGGTCTTG CGATGATTAT CATATAATTT
4381 CTGTTGAATT ACGTTAAGCA TGTAATAATT AACATGTAAT GCATGACGTT ATTTATGAGA
4441 TGGGTTTTTA TGATTAGAGT CCCGCAATTA TACATTTAAT ACGCGATAGA AAACAAAATA
4501 TAGCGCGCAA ACTAGGATAA ATTATCGCGC GCGGTGTCAT CTATGTTACT AGATCGGGAA 
4561 TTCCTTAATT AAGAATTCGA GCTCGGTACC GAGCTCGACT TTCACTTTTC TCTATCACTG 
4621 ATAGGGAGTG GTAAACTCGA CTTTCATTTT CTCTATCACT GATAGGGAGT GGTAAACTCG 
4681 ACTTTCACTT TTCTCTATCA CTGATAGGGA GTGGTAAACT CGACTTTCAC TTTTCTCTAT
4741 CACGGATAGG GAGTGGTAAA CTCGACTTTC ACTTTTCTCT ATCACTGATA GGGAGTGGTA 
4801 AACTCGACTT TCACTTTTCT CTATCACTGA TAGGGAGTGG TAAACTCGAC TTTCACTTTT 
4861 CTCTATCACT GATAGGGAGT GGTAAACTCG AGATAGAGTG ATCTAGTCTT CGCAAGACCC 
4921 TTTACGTATA TAAGGCCTTT CTAGACATTT GCTCGAGCCC GGGGATCCAT ATGGCCATGG
4981 CACAAGGGGT TGTGACCGGG GTGGATACGT AAGTTTCTGC TTCTACCTTT GATATATATA
5041 TAATAATTAT CATTAATTAG TAGTAATATA ATATTTCAAA TATTTTTTTC AAAATAAAAG
5101 AATGTAGTAT ATAGCAATTG CTTTTCTGTA GTTTATAAGT GTGTATATTT TAATTTATAA
5161 CTTTTCTAAT ATATGACCAA AATTTGTTGA TGTGCAGGTA CGCGGGTGCT TACGACCGTC
5221 AGTCGCGCGA GCGCGAGAAT TCGAGCGCAG CAAGCCCAGC GACACAGCGT AGCGCCAACG
5281 AAGACAAGGC GGCCGACCTT CAGCGCGAAG TCGAGCGCGA CGGGGGCCGG TTCAGGTTCG 
5341 TCGGGCATTT CAGCGAAGCG CCGGGCACGT CGGCGTTCGG GACGGCGGAG CGCCCGGAGT 
5401 TCGAACGCAT CCTGAACGAA TGCCGCGCCG GGCGGCTCAA CATGATCATT GTCTATGACG
5461 TGTCGCGCTT CTCGCGCCTG AAGGTCATGG ACGCGATTCC GATTGTCTCG GAATTGCTCG 
5521 CCCTGGGCGT GACGATTGTT TCCACTCAGG AAGGCGTCTT CCGGCAGGGA AACGTCATGG 
5581 ACCTGATTCA CCTGATTATG CGGCTCGACG CGTCGCACAA AGAATCTTCG CTGAAGTCGG 
5641 CGAAGATTCT CGACACGAAG AACCTTCAGC GCGAATTGGG CGGGTACGTC GGCGGGAAGG 
5701 CGCCTTACGG CTTCGAGCTT GTTTCGGAGA CGAAGGAGAT CACGCGCAAC GGCCGAATGG 
5761 TCAATGTCGT CATCAACAAG CTTGCGCACT CGACCACTCC CCTTACCGGA CCCTTCGAGT 
5821 TCGAGCCCGA CGTAATCCGG TGGTGGTGGC GTGAGATCAA GACGCACAAA CACCTTCCCT 
5881 TCAAGCCGGG CAGTCAAGCC GCCATTCACC CGGGCAGCAT CACGGGGCTT TGTAAGCGCA 
5941 TGGACGCTGA CGCCGTGCCG ACCCGGGGCG AGACGATTGG GAAGAAGACC GCTTCAAGCG 
6001 CCTGGGACCC GGCAACCGTT ATGCGAATCC TTCGGGACCC GCGTATTGCG GGCTTCGCCG 
6061 CTGAGGTGAT CTACAAGAAG AAGCCGGACG GCACGCCGAC CACGAAGATT GAGGGTTACC 
6121 GCATTCAGCG CGACCCGATC ACGCTCCGGC CGGTCGAGCT TGATTGCGGA CCGATCATCG
6181 AGCCCGCTGA GTGGTATGAG CTTCAGGCGT GGTTGGACGG CAGGGGGCGC GGCAAGGGGC
6241 TTTCCCGGGG GCAAGCCATT CTGTCCGCCA TGGACAAGCT GTACTGCGAG TGTGGCGCCG 
6301 TCATGACTTC GAAGCGCGGG GAAGAATCGA TCAAGGACTC TTACCGCTGC CGTCGCCGGA 
6361 AGGTGGTCGA CCCGTCCGCA CCTGGGCAGC ACGAAGGCAC GTGCAACGTC AGCATGGCGG 
6421 CACTCGACAA GTTCGTTGCG GAACGCATCT TCAACAAGAT CAGGCACGCC GAAGGCGACG
6481 AAGAGACGTT GGCGCTTCTG TGGGAAGCCG CCCGACGCTT CGGCAAGCTC ACTGAGGCGC 
6541 CTGAGAAGAG CGGCGAACGG GCGAACCTTG TTGCGGAGCG CGCCGACGCC CTGAACGCCC
6601 TTGAAGAGCT GTACGAAGAC CGCGCGGCAG GCGCGTACGA CGGACCCGTT GGCAGGAAGC 
6661 ACTTCCGGAA GCAACAGGCA GCGCTGACGC TCCGGCAGCA AGGGGCGGAA GAGCGGCTTG 
6721 CCGAACTTGA AGCCGCCGAA GCCCCGAAGC TTCCCCTTGA CCAATGGTTC CCCGAAGACG 
6781 CCGACGCTGA CCCGACCGGC CCTAAGTCGT GGTGGGGGCG CGCGTCAGTA GACGACAAGC 
6841 GCGTGTTCGT CGGGCTCTTC GTAGACAAGA TCGTTGTCAC GAAGTCGACT ACGGGCAGGG 
6901 GGCAGGGAAC GCCCATCGAG AAGCGCGCTT CGATCACGTG GGCGAAGCCG CCGACCGACG 
6961 ACGACGAAGA CGACGCCCAG GACGGCACGG AAGACGTAGC GGCGTAGCTG CAGCTCGACG 
7021 CATGCCCTGC TTTAATGAGA TATGCGAGAC GCCTATGATC GCATGATATT TGCTTTCAAT 
7081 TCTGTTGTGC ACGTTGTAAA AAACCTGAGC ATGTGTAGCT CAGATCCTTA CCGCCGGTTT 
7141 CGGTTCATTC TAATGAATAT ATCACCCGTT ACTATCGTAT TTTTATGAAT AATATTCTCC
7201 GTTCAATTTA CTGATTGTCC AAGCTTCCTG CAGGAAGCTT TGGGCGGATC CTCTAGATTC 
7261 GACGGTATCG ATAAGCTCGC GGATCCCTGA AAGCGACGTT GGATGTTAAC ATCTACAAAT 
7321 TGCCTTTTCT TATCGACCAT GTACGTAAGC GCTTACGTTT TTGGTGGACC CTTGAGGAAA 
7381 CTGGTAGCTG TTGTGGGCCT GTGGTCTCAA GATGGATCAT TAATTTCCAC CTTCACCTAC 
7441 GATGGGGGGC ATCGCACCGG TGAGTAATAT TGTACGGCTA AGAGCGAATT TGGCCTGTAG
7501 GATCCCTGAA AGCGACGTTG GATGTTAACA TCTACAAATT GCCTTTTCTT ATCGACCATG
7561 TACGTAAGCG CTTACGTTTT TGGTGGACCC TTGAGGAAAC TGGTAGCTGT TGTGGGCCTG
7621 TGGTCTCAAG ATGGATCATT AATTTCCACC TTCACCTACG ATGGGGGGCA TCGCACCGGT 
7681 GAGTAATATT GTACGGCTAA GAGCGAATTT GGCCTGTAGG ATCCCTGAAA GCGACGTTGG 
7741 ATGTTAACAT CTACAAATTG CCTTTTCTTA TCGACCATGT ACGTAAGCGC TTACGTTTTT
7801 GGTGGACCCT TGAGGAAACT GGTAGCTGTT GTGGGCCTGT GGTCTCAAGA TGGATCATTA 
7861 ATTTCCACCT TCACCTACGA TGGGGGGCAT CGCACCGGTG AGTAATATTG TACGGCTAAG 
7921 AGCGAATTTG GCCTGTAGGA TCCGCGAGCT GGTCAATCCC ATTGCTTTTG AAGCAGCTCA
7981 ACATTGATCT CTTTCTCGAT CGAGGGAGAT TTTTCAAATC AGTGCGCAAG ACGTGACGTA 
8041 AGTATCCGAG TCAGTTTTTA TTTTTCTACT AATTTGGTCG TTTATTTCGG CGTGTAGGAC 
8101 ATGGCAACCG GGCCTGAATT TCGCGGGTAT TCTGTTTCTA TTCCAACTTT TTCTTGATCC 
8161 GCAGCCATTA ACGACTTTTG AATAGATACG CTGACACGCC AAGCCTCGCT AGTCAAAAGT 
8221 GTACCAAACA ACGCTTTACA GCAAGAACGG AATGCGCGTG ACGCTCGCGG TGACGCCATT
8281 TCGCCTTTTC AGAAATGGAT AAATAGCCTT GCTTCCTATT ATATCTTCCC AAATTACCAA
8341 TACATTACAC TAGCATCTGA ATTTCATAAC CAATCTCGAT ACACCAAATC GAAGATCCAA
8401 GGAGATATAA CAATGAAGAC TAATCTTTTT CTCTTTCTCA TCTTTTCACT TCTCCTATCA
8461 TTATCCTCGG CCGAATTGTA CGTAAGTTTC TGCTTCTACC TTTGATATAT ATATAATAAT
8521 TATCATTAAT TAGTAGTAAT ATAATATTTC AAATATTTTT TTCAAAATAA AAGAATGTAG 
8581 TATATAGCAA TTGCTTTTCT GTAGTTTATA AGTGTGTATA TTTTAATTTA TAACTTTTCT
8641 AATATATGAC CAAAATTTGT TGATGTGCAG GTACAATTCA GTAAAGGAGA AGAACTTTTC 
8701 ACTGGAGTTG TCCCAATTCT TGTTGAATTA GATGGTGATG TTAATGGGCA CAAATTTTCT 
8761 GTCAGTGGAG AGGGTGAAGG TGATGCAACA TACGGAAAAC TTACCCTTAA ATTTATTTGC 
8821 ACTACTGGAA AACTACCTGT TCCATGGCCA ACACTTGTCA CTACTTTCAC TTATGGTGTT 
8881 CAATGCTTTT CAAGATACCC AGATCATATG AAGCGGCACG ACTTCTTCAA GAGCGCCATG 
8941 CCTGAGGGAT ACGTGCAGGA GAGGACCATC TCTTTCAAGG ACGACGGGAA CTACAAGACA 
9001 CGTGCTGAAG TCAAGTTTGA GGGAGACACC CTCGTCAACA GGATCGAGCT TAAGGGAATC 
9061 GATTTCAAGG AGGACGGAAA CATCCTCGGC CACAAGTTGG AATACAACTA CAACTCCCAC
9121 AACGTATACA TCACGGCAGA CAAACAAAAG AATGGAATCA AAGCTAACTT CAAAATTAGA 
9181 CACAACATTG AAGATGGAAG CGTTCAACTA GCAGACCATT ATCAACAAAA TACTCCAATT 
9241 GGCGATGGCC CTGTCCTTTT ACCAGACAAC CATTACCTGT CCACACAATC TGCCCTTTCG 
9301 AAAGATCCCA ACGAAAAGAG AGACCACATG GTCCTTCTTG AGTTTGTAAC AGCTGCTGGG 
9361 ATTACACATG GCATGGATGA ACTATACAAA CATGATGAGC TTTAAGAGCT CGAATTTCCC
9421 CGATCGTTCA AACATTTGGC AATAAAGTTT CTTAAGATTG AATCCTGTTG CCGGTCTTGC 
9481 GATGATTATC ATATAATTTC TGTTGAATTA CGTTAAGCAT GTAATAATTA ACATGTAATG 
9541 CATGACGTTA TTTATGAGAT GGGTTTTTAT GATTAGAGTC CCGCAATTAT ACATTTAATA 
9601 CGCGATAGAA AACAAAATAT AGCGCGCAAA CTAGGATAAA TTATCGCGCG CGGTGTCATC 
9661 TATGTTACTA GATCGGGAAT TCGCGATCGC CCCAACTGGG GTAACCTTTG AGTTCTCTCA 
9721 GTTGGGGGAG ATCTGATTGT CGTTTCCCGC CTTCAGTTTA AACTATCAGT GTTTGACAGG 
9781 ATATATTGGC GGGTAAACCT AAGAGAAAAG AGCGTTTATT AGAATAATCG GATATTTAAA 
9841 AGGGCGTGAA AAGGTTTATC CGTTCGTCCA TTTGTATGTC 




FIGURE 9 

Nucleotide sequence of Arabidopsis thaliana GA4H promoter region 

1 TGTAAATGAT AGGGATTGAA ACATCATCCT ATCGTTGACC AAAAATTTCA CTGCGTGCTA 
61 TATAAAATAC TATATATGTT ACCCTTTAAC TGATGAAAAT GTAAAGAGAC AAGGCAGCAC
121 CGTTTATCAT CAGACCAGTT TCGAGAGTGT TCCTGCATCG TTGGGCTCCC TCCTCAATTT
181 TGTCTACGTG ATTATATATC ATATCGTCTA CAAACAAAAT AAATACAATT CTATCATATG
241 AATATGTGAT CATCGATGAT CGATCAATAT ATGTTTTCGA GGTGACGTAT ATAGTATATT
301 TCCGTAGAGA CGGCGAAGAA CATGATATCT CTGCATGCCT CCAATCAAAT CTTTACACTT 
361 CATCCTTCTT CGTTACTTGT TCAGTTGTTC CTTTCTAATC CCGACAACCC TTAATTTGTA 
421 TTTCTATATT AGATCGAAAT ATCTCATTTG TGATAAATAA AATAAAAAAA ATCAAAGAAA
481 GCTATAGAGA AGCTGCGTGC ATGCATGGGT TGGCGATGTT TGGCTTGTTA TGTTTGGCTT 
541 GTTATGTGGC ATTATCTGTA TGTATATTAC CCTAAATCAC ATCTACGACA TTTCCCTCGA 
601 TCTTCAAAAT ATGCCAGCAA TCTTCATGTT TCCTCATATC TCTTAACATT GGAAAATGTC 
661 TTTTGACCTC TTTTGATGTA TTTTAAATTA CTTCGAGCTC ATCTATATTA CAAATCATTC 
721 ATGGTGAATT ATTGTCCAGC CAATAGAATA GAAATCTGAA TATAATGTGT ACCACATCTT 
781 TTATGTAATT TATACGATAT TCTTTTTCCT GAGAATGATC AAATAACAAC ATGCATGAAT
841 TGCTGCCAGA AAACGTCAGA TTGATCAGTT ATCACTACAA TTATCAATTA ACTAGTAAAT 
901 AGTATCAAAA TGTACGTAGT GCCCATCTAT AGCTAGCTAA GGAGGACTCC GGATGTAGAG 
961 AAAAGCTAAA ATGTGACTTG CTAGAGTTGT ATTATATTGA ATTTTCTAAA CTAATAGTAT 
1021 CTTTTTTACA GATAATAATT TCCGGAAAAC CTATTAGATG TATAGATATA ACAATAAGCA
1081 TCGATACCAA CCTTTTACTT CCAAAAAAAA ATAAAAAAAA AATGCCAAGA TGAGATAATT
1141 TTGTCAATTT CAATTAGTGG GAAAATAACA ATTGTCGTGT TATTTTTGAA CCAACGCATC
1201 TCAGTGAATG ATTTCCCAGT TCTTAAGATT TTAGGACATA CTTTCCCAGT AACATCTAAT
1261 CCGTTTGGGC ATAAACAAGA CAATTTGTAG TTATGTACAT TTCTTAGTGA TGTGTGTTGA 
1321 AAAGATATGA ATCAATGAGG TCCGACATAT TTTGTCAATA CGTTAGTGGT GTTTCAAAAT
1381 AAATTTTTAG TATATATATT AAAATAAGAC CAAAGGATAG GCTTTAGTGG TGTTTCAGGT 
1441 ATAGTTTTAA TAATCAATTC AAAATAAGTC GAAAGGATAT GTAAGATAGG CGTTATTTCA
1501 ACGTGGATCA TTATCAACCA TGTCAAAAAC GCATTTCAAC TCCTAGATGT GTTGTTAGTT 
1561 ATATATGTTC CAAATGGAAT CGACCCAACA GAAAAAGAGA AAAAAACGTA AAAGGTTATG
1621 CGATTCCAGG GACGTCTCAT ATATATATAT ATTCCGATGA AATATAAATA TAATTATCGT 
1681 GGTCTGTGAC AATAAATATG GAAATAGATG TGGAAATCAT GATCATGTGA AGAAGAAGAA
1741 GAACACGTGC AGATGAACTG CAAATGATAA TAATGTGCAT GTCCATGAGT TATGTACTTA
1801 TGTGTATTAT CTACGTGTTT TCCATACATA CATATATAAA TCTTATATTA CTTTATGGTT
1861 TTGTCGTAAA AGTTACGTAG CATCAATAAT TGTGATTCGT TGCCATAAAC AGACAACTAC
1921 TTGTAACGGT ATAAGGCTTG GCTCTCATGA TAAAATGATA ACCCTTTTTT TCGTCGGAGA 
1981 CAGACAAACG CATAAATCAC TAATTCTAAA CCGAGATGAT TGTCGATTTG TTTGCCATAT 
2041 GCATAACTAG AATCTTCAGT TAATATTAAT TTTTGGTGTT TTCGATCGAA TAAAAAAAAA 
2101 TAAACATTGC AATATTTCGA AATTTGTCGT CTTTCTTTTT ATAACACTAG CAAGTGAGAG 
2161 GCTGAGAGCC AAGTGGAACG TTAAAAGACA ACATTAGATA TATATTATAT ATTGCTAAAT
2221 CTGTATTATT TCTTTTTAAC ATACGCAACT TTTGATTGGA AATCGTAAGT CGAAGGAAGG
2281 GCCTCGATTT ATGACGTACG CTTCGTGCCA AACAATTCCT CTTTAGTTGA GGCCGGGGAA
2341 GACGAGTTTG TTGTTAGTGA GCGATGCCAT GGCATCAATG AACTCCCAAA GGCCATATGT
2401 TCTGTTAAAG GCTATTTTAG TTTTTAATTT TGTTCTGATT AACTCAACCA CATGTTAAAT
2461 CAGATATCAT GTTTAACGAT ATTAGTTTTT AAACAAAATG ATTATCATAA AACGAAATTT
2521 ATGATGAAAC ATATATAATC TTTATCTTGT TTAAGTATGT AATTCTTGTA TGTTTGTATA
2581 CGCCTTGCAA ATCAAAAAAC TAGTTGCTGT TTTTGGCATT GTGTTTACGA AATATTTATT
2641 AATATTTTAA ATTAATTAAA TAAATGTTCT TATTTCTCAA CAGGAAACAA TATGTATTTT 
2701 CTTTCTTTAT AAAATTACAA TGAATTATTT GTTTTAAGCT GTCTATTTCC AAGAAACAAA 
2761 ACACAAAAAT GATAAATTTA TAATAGTCAC ATAACCTGTC TTACAAAAAA AAAAAGAAAA
2821 GCGAAAAGAA ATGTGACAAC AGAAAATGGT TTTGATAACC AATAAGAATC GACAAAAAAA 
2881 AAACTTACTC CACATATACT CTTCTCTTCA CTCTTCAGTC TTCACTATTC AGTCTCGAGT
2941 ATTTCACCGA TCTATAAATA CACTCCTCTT CTCCACCAAA AGTATCATAT CATACCAAAA
3001 ACATAAAGCC AAAATATAAA CACATAAGCC TTTTA



